DROPS

Document

DOI: 10.4230/LIPIcs.CPM.2023.8

Optimal Near-Linear Space Heaviest Induced Ancestors

Authors: Panagiotis Charalampopoulos, Bartłomiej Dudek, Paweł Gawrychowski, and Karol Pokorski

Published in: LIPIcs, Volume 259, 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)

Abstract

We revisit the Heaviest Induced Ancestors (HIA) problem that was introduced by Gagie, Gawrychowski, and Nekrich [CCCG 2013] and has a number of applications in string algorithms. Let T₁ and T₂ be two rooted trees whose nodes have weights that are increasing in all root-to-leaf paths, and labels on the leaves, such that no two leaves of a tree have the same label. A pair of nodes (u, v) ∈ T₁ × T₂ is induced if and only if there is a label shared by leaf-descendants of u and v. In an HIA query, given nodes x ∈ T₁ and y ∈ T₂, the goal is to find an induced pair of nodes (u, v) of the maximum total weight such that u is an ancestor of x and v is an ancestor of y. Let n be the upper bound on the sizes of the two trees. It is known that no data structure of size 𝒪̃(n) can answer HIA queries in o(log n / log log n) time [Charalampopoulos, Gawrychowski, Pokorski; ICALP 2020]. This (unconditional) lower bound is a polyloglog n factor away from the query time of the fastest 𝒪̃(n)-size data structure known to date for the HIA problem [Abedin, Hooshmand, Ganguly, Thankachan; Algorithmica 2022]. In this work, we resolve the query-time complexity of the HIA problem for the near-linear space regime by presenting a data structure that can be built in 𝒪̃(n) time and answers HIA queries in 𝒪(log n/log log n) time. As a direct corollary, we obtain an 𝒪̃(n)-size data structure that maintains the LCS of a static string and a dynamic string, both of length at most n, in time optimal for this space regime. The main ingredients of our approach are fractional cascading and the utilization of an 𝒪(log n/ log log n)-depth tree decomposition. The latter allows us to break through the Ω(log n) barrier faced by previous works, due to the depth of the considered heavy-path decompositions.

Cite as

Panagiotis Charalampopoulos, Bartłomiej Dudek, Paweł Gawrychowski, and Karol Pokorski. Optimal Near-Linear Space Heaviest Induced Ancestors. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.CPM.2023.8,
  author =	{Charalampopoulos, Panagiotis and Dudek, Bart{\l}omiej and Gawrychowski, Pawe{\l} and Pokorski, Karol},
  title =	{{Optimal Near-Linear Space Heaviest Induced Ancestors}},
  booktitle =	{34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
  pages =	{8:1--8:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-276-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{259},
  editor =	{Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2023.8},
  URN =		{urn:nbn:de:0030-drops-179624},
  doi =		{10.4230/LIPIcs.CPM.2023.8},
  annote =	{Keywords: data structures, string algorithms, fractional cascading}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2020.23

Counting 4-Patterns in Permutations Is Equivalent to Counting 4-Cycles in Graphs

Authors: Bartłomiej Dudek and Paweł Gawrychowski

Published in: LIPIcs, Volume 181, 31st International Symposium on Algorithms and Computation (ISAAC 2020)

Abstract

Permutation σ appears in permutation π if there exists a subsequence of π that is order-isomorphic to σ. The natural algorithmic question is to check if σ appears in π, and if so count the number of occurrences. Only since very recently we know that for any fixed length k, we can check if a given pattern of length k appears in a permutation of length n in time linear in n, but being able to count all such occurrences in f(k)⋅ n^o(k/log k) time would refute the exponential time hypothesis (ETH). Together with practical applications in statistics, this motivates a systematic study of the complexity of counting occurrences for different patterns of fixed small length k. We investigate this question for k = 4. Very recently, Even-Zohar and Leng [arXiv 2019] identified two types of 4-patterns. For the first type they designed an 𝒪̃(n) time algorithm, while for the second they were able to provide an 𝒪̃(n^1.5) time algorithm. This brings up the question whether the permutations of the second type are inherently harder than the first type. We establish a connection between counting 4-patterns of the second type and counting 4-cycles (not necessarily induced) in a sparse undirected graph. By designing two-way reductions we show that the complexities of both problems are the same, up to polylogarithmic factors. This allows us to leverage the work done on the latter to provide a reasonable argument for why there is a difference in the complexities for counting 4-patterns of the first and the second type. In particular, even for the seemingly simpler problem of detecting a 4-cycle in a graph on m edges, the best known algorithm works in 𝒪(m^{4/3}) time. Our reductions imply that an 𝒪(n^{4/3-ε}) time algorithm for counting occurrences of any 4-pattern of the second type in a permutation of length n would imply an exciting breakthrough for counting (and hence also detecting) 4-cycles. In the other direction, by plugging in the fastest known algorithm for counting 4-cycles, we obtain an algorithm for counting occurrences of any 4-pattern of the second type in 𝒪(n^1.48) time.

Cite as

Bartłomiej Dudek and Paweł Gawrychowski. Counting 4-Patterns in Permutations Is Equivalent to Counting 4-Cycles in Graphs. In 31st International Symposium on Algorithms and Computation (ISAAC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 181, pp. 23:1-23:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{dudek_et_al:LIPIcs.ISAAC.2020.23,
  author =	{Dudek, Bart{\l}omiej and Gawrychowski, Pawe{\l}},
  title =	{{Counting 4-Patterns in Permutations Is Equivalent to Counting 4-Cycles in Graphs}},
  booktitle =	{31st International Symposium on Algorithms and Computation (ISAAC 2020)},
  pages =	{23:1--23:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-173-3},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{181},
  editor =	{Cao, Yixin and Cheng, Siu-Wing and Li, Minming},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2020.23},
  URN =		{urn:nbn:de:0030-drops-133678},
  doi =		{10.4230/LIPIcs.ISAAC.2020.23},
  annote =	{Keywords: Permutations, pattern avoidance, counting cycles}
}

Document

DOI: 10.4230/LIPIcs.STACS.2020.18

Generalised Pattern Matching Revisited

Authors: Bartłomiej Dudek, Paweł Gawrychowski, and Tatiana Starikovskaya

Published in: LIPIcs, Volume 154, 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)

Abstract

In the problem of Generalised Pattern Matching (GPM) [STOC'94, Muthukrishnan and Palem], we are given a text T of length n over an alphabet Σ_T, a pattern P of length m over an alphabet Σ_P, and a matching relationship ⊆ Σ_T × Σ_P, and must return all substrings of T that match P (reporting) or the number of mismatches between each substring of T of length m and P (counting). In this work, we improve over all previously known algorithms for this problem: - For ? being the maximum number of characters that match a fixed character, we show two new Monte Carlo algorithms, a reporting algorithm with time ?(? n log n log m) and a (1-ε)-approximation counting algorithm with time ?(ε^-1 ? n log n log m). We then derive a (1-ε)-approximation deterministic counting algorithm for GPM with ?(ε^-2 ? n log⁶ n) time. - For ? being the number of pairs of matching characters, we demonstrate Monte Carlo algorithms for reporting and (1-ε)-approximate counting with running time ?(√? n log m √{log n}) and ?(√{ε^-1 ?} n log m √{log n}), respectively, as well as a (1-ε)-approximation deterministic algorithm for the counting variant of GPM with ?(ε^-1 √{?} n log^{7/2} n) time. - Finally, for ℐ being the total number of disjoint intervals of characters that match the m characters of the pattern P, we show that both the reporting and the counting variants of GPM can be solved exactly and deterministically in ?(n√{ℐ log m} +n log n) time. At the heart of our new deterministic upper bounds for ? and ? lies a faster construction of superimposed codes, which solves an open problem posed in [FOCS'97, Indyk] and can be of independent interest. To conclude, we demonstrate first lower bounds for GPM. We start by showing that any deterministic or Monte Carlo algorithm for GPM must use Ω(?) time, and then proceed to show higher lower bounds for combinatorial algorithms. These bounds show that our algorithms are almost optimal, unless a radically new approach is developed.

Cite as

Bartłomiej Dudek, Paweł Gawrychowski, and Tatiana Starikovskaya. Generalised Pattern Matching Revisited. In 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 154, pp. 18:1-18:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{dudek_et_al:LIPIcs.STACS.2020.18,
  author =	{Dudek, Bart{\l}omiej and Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  title =	{{Generalised Pattern Matching Revisited}},
  booktitle =	{37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)},
  pages =	{18:1--18:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-140-5},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{154},
  editor =	{Paul, Christophe and Bl\"{a}ser, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2020.18},
  URN =		{urn:nbn:de:0030-drops-118798},
  doi =		{10.4230/LIPIcs.STACS.2020.18},
  annote =	{Keywords: pattern matching, superimposed codes, conditional lower bounds}
}

Search Results

Documents authored by Dudek, Bartłomiej

Optimal Near-Linear Space Heaviest Induced Ancestors

Abstract

Cite as

Counting 4-Patterns in Permutations Is Equivalent to Counting 4-Cycles in Graphs

Abstract

Cite as

Generalised Pattern Matching Revisited

Abstract

Cite as

Thanks for your feedback!

Could not send message